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ABSTRACT 

The relationship of multiple linear regression to 
various multivariate statistical techniques is discussed. The 
importance of the standardized partial regression coefficient (beta 
weight) in multiple linear regression as it is applied in path, 
factor, LISREL, and discriminant analyses is emphasized. The 
multivariate methods discussed in this paper have in common the 
general linear model and are the same in several other respects: (1) 
they identify, partition, and control variance; (2) they are based on 
linear combinations of variables; and (3) the linear weights can be 
computed based on standardized partial regression coefficients. 
However, these methods have different applications. While multiple 
regression seeks to identify and estimate the amount of variance in 
the dependent variable attributed to one or more independent 
variables, path analysis attempts to identify relationships among a 
set of variables. Factor analysis tries to identify subsets of 
variables from a much larger set. The LISREL program determines the 
degree of model specification and measurement error. Discriminant 
analysis seeks to identify a linear combination of variables that can 
be used to assign subjects to groups. An understanding of multiple 
regression and general linear model techniques can greatly facilitate 
one's understanding of the testing of research questions in 
multivariate situations. Eight appendices contain computer program 
examples based on correlational input as illustrations of these 
methods. A 47-item list of references is provided. (SLD) 
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Preface 

The appropriate statistical method to use is often an 
issue of debate. It sometimes requires more than one 
approach to analyze data. The rationale for choosing 
between the alternative methods of analysis is usually 
guided by: 

a. purpose of the research 

b. research hypothesis or question 

c. mathematical characteristics of the variables 

d. sampling procedures 

e. statistical assumptions 

f. model validity 

Multiple linear regression as a general linear model 
technique provides an excellent educational framework in 
which to analyze univariable and multivariable research 
questions (Newman, 1988). The present paper extends the 
relationship of multiple linear regression to various 
multivariable techniques: path, factor, LISREL, and 
discriminant analyses. Its primary focus is to 
indicate the use of the standardized partial regression 
coefficient (beta weight) in these multivariable techniques. 

This paper did not concern itself with issues of 
standardized versus unstandardized regression coefficients, 
Type I and Type II error rates, R-square shrinkage, 
suppressor variables, number of predictors, 
multicollinearity, curvilinearity and trend analysis, or 
many other issues related to general linear model research. 
Model specification and measurement error were addressed, 
however, as an advantage of the LISREL approach. 
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INTRODUCTION 

Multiple Regression or the general linear model approach 
to the analysis of experimental data in educational research 
has become increasingly popular since 1967 (Bashaw and 
Findley, 1968). In fact, it is now recognized as 
an approach that bridges the gap between correlational and 
analysis of variance thought in answering research 
hypotheses (McNeil, Kelly, & McNeil, 1975). Statistical 
textbooks in psychology and education often present the 
relationship between data analysis with multiple regression 
and analysis of variance (Draper & Smith, 1966; Williams, 
1974; Roscoe, 1975; Edwards, 1979). Graduate students 
taking an advanced statistics course are therefore provided 
with the multiple linear regression framework for data 
analysis. Given their understanding of multiple linear 
regression techniques applied to univariate analysis (one 
dependent variable), their understanding can be extended to 
the relationship of multiple linear regression to various 
multivariate statistical techniques (Kelly, Beggs, McNeil, 
with Eichelberger & Lyon, 1969, pp. 228-248). The present 
paper will expand upon this understanding and indicate the 
importance of the standardized partial regression 
coefficient (beta weight) in multiple linear regression as 
it is applied in path, factor, LISREL, and discriminant 
analyses. 
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MULTIPLE REGRESSION 

Multiple Regression techniques require a basic 
understanding of sample statistics (n, mean, and variance), 
standardized variables, correlation (Pedhazur, 1982, pp. 53-57), 
and partial correlation (Cohen & Cohen, 1975; Houston & 
Holding, 1974). In standard score form the multiple 
regression equation is: 

$$\hat{z}_y = \beta z_x$$ 

The relationship between the correlation coefficient, the 
unstandardized regression coefficient and the standardized 
regression coefficient is: 



$$\beta = \frac{\sum z_x z_y}{\sum z_x^2} = b\,\frac{s_x}{s_y} = r_{xy}$$ 



For two independent variables, the regression equation with 
standard scores is: 

$$\hat{z}_y = \beta_1 z_1 + \beta_2 z_2$$ 

And the standardized partial regression coefficients are 
computed by: 



$$\beta_1 = \frac{r_{y1} - r_{y2}\,r_{12}}{1 - r_{12}^2} \qquad \beta_2 = \frac{r_{y2} - r_{y1}\,r_{12}}{1 - r_{12}^2}$$ 
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The correlation between the original and predicted 
scores is given the special name Multiple Correlation 
Coefficient. It is indicated as: 

$$R_{y\hat{y}} = R_{y.12}$$ 

And the Squared Multiple Correlation Coefficient is related 
as follows: 

$$R^2_{y\hat{y}} = R^2_{y.12} = \beta_1 r_{y1} + \beta_2 r_{y2}$$ 
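
The two-predictor formulas can be checked numerically. The following is a minimal sketch, not part of the original programs: the function names are hypothetical, and the three correlations are simply the Y, X1, and X2 entries of the correlation matrix listed in Appendix A, used here as illustrative input.

```python
def two_predictor_betas(r_y1, r_y2, r_12):
    """Standardized partial regression coefficients for two predictors."""
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


def squared_multiple_r(betas, validities):
    """Squared multiple correlation as the sum of beta_j * r_yj."""
    return sum(b * r for b, r in zip(betas, validities))


# Illustrative input: correlations among Y, X1, X2 from the Appendix A matrix.
r_y1, r_y2, r_12 = 0.507, 0.481, 0.224

beta_1, beta_2 = two_predictor_betas(r_y1, r_y2, r_12)
r_squared = squared_multiple_r((beta_1, beta_2), (r_y1, r_y2))

print(round(beta_1, 3), round(beta_2, 3), round(r_squared, 3))
```

Because the input is a correlation matrix, no means or standard deviations are needed; the beta weights depend only on the three correlations.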

MULTIPLE REGRESSION EXAMPLE 

A multiple linear regression example using a correlation 
matrix as input (SPSSX User's Guide, 3rd Edition, 1988, 
Chapter 13) is provided in Appendix A. The results are: 

$$R^2_{y.123} = \beta_1 r_{y1} + \beta_2 r_{y2} + \beta_3 r_{y3}$$ 

$$R^2_{y.123} = (.423)(.507) + (.363)(.481) + (.040)(.276) = .40$$ 
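
With three predictors the beta weights are the solution of the normal equations R_xx * beta = r_xy. The sketch below is an illustration under stated assumptions, not the SPSSX procedure itself: `solve_linear_system` is a hypothetical helper written here with plain Gauss-Jordan elimination, and the input is the correlation matrix from Appendix A.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]


# Correlations among the predictors X1, X2, X3 (Appendix A).
r_xx = [[1.000, 0.224, 0.062],
        [0.224, 1.000, 0.577],
        [0.062, 0.577, 1.000]]
# Correlations of each predictor with Y (Appendix A).
r_xy = [0.507, 0.481, 0.276]

betas = solve_linear_system(r_xx, r_xy)
r_squared = sum(b * r for b, r in zip(betas, r_xy))

print([round(b, 3) for b in betas], round(r_squared, 2))
```

The solution agrees with the reported beta weights (.423, .363, .040) and squared multiple correlation (.40) to rounding.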

A systematic determination of the most important set of 
variables can be accomplished by setting the partial 
regression weight of each variable to zero. This approach 
and other alternative methods are presented by Kelly (1969) 
and Darlington (1968) . 
In summary, regression techniques have been shown to be 
robust (Bohrnstedt & Carter, 1971) and applicable to contrast 
coding (Lewis & Mouw, 1978), dichotomous coding (McNeil, 
Kelly, & McNeil, 1975), and ordinal coding (Lyons, 1971) 
research situations. Multiple regression can also be viewed 
as a special case of path analysis. 
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PATH ANALYSIS 

Sewall Wright is credited with the development of path 
analysis as a method for studying the direct and indirect 
effects of variables (Wright, 1921, 1934, 1960). Path 
analysis is not a method for discovering causes; rather, a 
model must be specified by the researcher, similar to 
hypothesis testing in regression analysis. The specified 
model establishes causal relationships among the variables 
when: 

a. temporal ordering exists 

b. covariation (correlation) is present 

c. other causes are controlled for 

Model specification is necessary in examining multiple 
variable relationships. In the absence of a model, many 
different relationships among variables can be postulated 
with many different path coefficients being selected. For 
example, in a three variable model, the following 
relationships could be postulated: 
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The four different models have been considered without 
reversing the order of the variables. How can we decide 
which model is the correct one? Path analysis doesn't 
provide a way to specify the models but rather estimates the 
effects once the model has been specified "a priori". 

Path coefficients in path analysis take on the value 
of a product-moment correlation and/or standardized 
regression coefficients in a model (Wolfle, 1977). For 
example, given model (d): 




[Path diagram for model (d) not recoverable from the source.] 

THEN: 

$$\beta_1 = p_{y1} \qquad \beta_2 = p_{y2} \qquad r_{12} = p_{12}$$ 

A different set of terms is also used to describe the 
relationships among variables. The following terminology 
should help: 

endogenous - dependent variable 

exogenous - independent variable 

p - path coefficient 

p_e - path coefficient error 

→ - causal path 

↔ - correlated path 
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A path model is specified by the researcher based on 
theory or prior research. Variable relationships, once 
specified in standard score form, become standardized 
regression coefficients. In multiple regression, a 
dependent variable is regressed in a single analysis on all 
the independent variables; in path analysis, one or more 
multiple regression analyses are performed. Path 
coefficients are computed based upon only the particular set 
of independent variables that lead to the dependent variable 
under consideration. As in regression analysis, path 
analysis can handle dichotomous and ordinal data, but 
special coding and interpretation are necessary (Boyle, 1970; 
Lyons, 1971). 

MODEL SPECIFICATION 

Path models permit diagramming how a particular set of 
independent variables leads to a dependent variable under 
consideration. How the paths are drawn determines whether 
the independent variables are correlated causes 
(unanalyzed), mediated causes (indirect), or independent 
causes (direct). The model can be tested for the 
significance of path coefficients (Pedhazur, 1982, pp. 58-62) 
and against a goodness-of-fit criterion (Marascuilo & Levin, 
1983, pp. 169-172; Tatsuoka & Lohnes, 1988, pp. 98-100), which 
reflects the significance of the difference between the 
original and reproduced correlation matrix. This process is 
commonly called 






decomposing the correlation matrix (Asher, 1976, pp. 32-34) 
according to certain rules (Wright, 1934). 

PATH ANALYSIS EXAMPLE 

A four variable model path analysis is presented in 
Appendix B. In order to calculate the path coefficients for 
the model, two regression analyses were performed. The 
model, with the path coefficients, is: 

[Path diagram of the four-variable model; the only coefficient 
legible in the source is p_{Y2} = .362.] 



The original and reproduced correlations are presented 
in matrix form. The upper half represents original 
correlations and the lower half the reproduced correlations 
which include only the regression of direct paths linking 
independent variables to the dependent variable. 

VARIABLE       Y       X1      X2      X3 

Y            1.000    .507    .481    .276 
X1            .423   1.000    .224    .062 
X2            .362    .224   1.000    .577 
X3            .040   -.070    .593   1.000 

(Upper half: original correlations. Lower half: reproduced 
correlations.) 
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The original correlations can be completely "reproduced" if 
all effects: direct (DE) , indirect (IE), spurious (S) and 
correlated (C) are included. For example: 



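$$r_{12} = \underbrace{p_{12}}_{C} = .224$$ 

$$r_{13} = \underbrace{p_{31}}_{DE} + \underbrace{p_{32}\,p_{21}}_{IE} = .062$$ 

$$r_{23} = \underbrace{p_{32}}_{DE} + \underbrace{p_{31}\,p_{21}}_{S} = .577$$ 

$$r_{1Y} = \underbrace{p_{Y1}}_{DE} + \underbrace{p_{Y2}\,p_{21}}_{IE} + \underbrace{p_{Y3}\,p_{31}}_{IE} + \underbrace{p_{Y3}\,p_{32}\,p_{21}}_{IE} = .507$$ 

$$r_{2Y} = \underbrace{p_{Y2}}_{DE} + \underbrace{p_{Y3}\,p_{32}}_{IE} + \underbrace{p_{Y1}\,p_{21}}_{S} + \underbrace{p_{Y3}\,p_{31}\,p_{21}}_{S} = .481$$ 

$$r_{3Y} = \underbrace{p_{Y3}}_{DE} + \underbrace{p_{Y1}\,p_{31}}_{S} + \underbrace{p_{Y2}\,p_{32}}_{S} + \underbrace{p_{Y1}\,p_{21}\,p_{32}}_{S} + \underbrace{p_{Y2}\,p_{21}\,p_{31}}_{S} = .276$$ 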
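
The decomposition can be verified by summing the direct, indirect, spurious, and correlated components for each pair of variables. The sketch below is illustrative only: the variable names are hypothetical, and the path coefficients are the three-decimal values reported for the four-variable example (Y on X1, X2, X3 and X3 on X1, X2, with the X1-X2 path left as the correlation .224), so the reproduced correlations match the originals only to rounding.

```python
# Path coefficients as reported for the four-variable example (rounded).
p21 = 0.224                            # correlated path between X1 and X2
p31, p32 = -0.070, 0.593               # X3 regressed on X1 and X2
pY1, pY2, pY3 = 0.423, 0.362, 0.040    # Y regressed on X1, X2, and X3

# Each correlation is the sum of its component paths.
reproduced = {
    "r12": p21,
    "r13": p31 + p32 * p21,
    "r23": p32 + p31 * p21,
    "r1Y": pY1 + pY2 * p21 + pY3 * p31 + pY3 * p32 * p21,
    "r2Y": pY2 + pY3 * p32 + pY1 * p21 + pY3 * p31 * p21,
    "r3Y": pY3 + pY1 * p31 + pY2 * p32 + pY1 * p21 * p32 + pY2 * p21 * p31,
}

# Original correlations from the input matrix.
original = {"r12": 0.224, "r13": 0.062, "r23": 0.577,
            "r1Y": 0.507, "r2Y": 0.481, "r3Y": 0.276}

for name in original:
    print(name, round(reproduced[name], 3), original[name])
```

Small differences in the third decimal place are expected because the path coefficients entering the products are themselves rounded.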



In summary, path analysis can be carried out within the 
context of ordinary regression analysis and does not require 
the learning of any new analysis techniques (Asher, 1976, 
p. 32; Williams, 1974). The advantage of path analysis is 
that it enables one to specify direct and indirect effects 
among independent variables. In addition, path analysis 
enables us to decompose the correlation between any two 
variables into simple and complex paths, of which some are 
meaningful. Path coefficients and the relationship between 
the original and reproduced correlation matrix can also be 
tested for significance. 
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FACTOR ANALYSIS 

Path models and the associated test of significance 
between original and reproduced correlations are used in 
confirmatory factor analysis. Factor analysis assumes that 
the observed (measured) variables are linear combinations of 
some underlying source variable (factor) . In practice, one 
estimates population parameters of the measured variables 
from a sample (with the uncertainties of model specification 
and measurement error). A linear combination of weighted 
variables relates to multiple regression in a single factor 
model and to a linear causal system (path analysis, or 
"multiple" multiple regressions) in multiple factor models. 
Path diagrams therefore permit representation of the causal 
relationships among factors and observed (measured) 
variables in factor analysis. 

In general, the first step in factor analysis involves 
the study of interrelationships among variables in the 
correlation matrix. Factor analysis addresses the 
question of whether subsets of correlated variables can be 
identified by one or more factors (hypothetical constructs). 
Confirmatory factor analysis is used to test specific 
hypotheses regarding which variables correlate with which 
constructs. 
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FACTOR MODELS 

Factor analysis assumes that some factors, which are 
smaller in number than the number of observed variables, are 
responsible for the covariation among the observed 
variables. For example, given a unidimensional trait in a 
single factor model with four variables the diagram would be 
(Kim & Mueller, 1978a, p 35) : 

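[Path diagram of the single factor model with four observed 
variables; the only value legible in the source is d = .735.] 

WHERE: 

p_i = standardized regression coefficient; 
      path coefficient; or common factor loading 

d_i = residual coefficient; path error 
      coefficient; or unique factor loading 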

The variance of each observed variable is therefore 
composed of the proportion of variance determined by the 
common factor and the proportion determined by the unique 
factor, which together equal the total variance of each 
observed variable. Therefore: 

$$s_i^2 = p_i^2 + d_i^2 = 1$$ 
The correlation between a common factor and a variable 
is: 

$$r_{F,X_i} = p_i$$ 

The correlation between a unique factor and a variable 
is: 

$$r_{U_i,X_i} = d_i$$ 

The correlation between observed (measured) variables 
sharing a common factor is: 

$$r_{X_i,X_j} = p_i\,p_j$$ 

And finally, the variance attributed to the factor as a 
result of the linear combination of variables is: 

$$h^2 = \frac{\sum p_i^2}{M} = R^2_{F.1234}$$ 

Where: M = number of variables 

       p_i^2 = squared factor loadings 

Note: \sum p_i^2 = eigenvalue 

      p_i^2 = communality 
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FACTOR ANALYSIS EXAMPLE 

A single factor model analysis with four variables in a 
correlation matrix format is in Appendix C. The path 
diagram is the same as above (Kim & Mueller, 1978, p 35) 
with the weights as follows: 

$$p_Y = .677 \qquad p_1 = .402 \qquad p_2 = .800 \qquad p_3 = .535$$ 

And factor scores are computed as: 

$$F = p_Y\,y + p_1 x_1 + p_2 x_2 + p_3 x_3$$ 

Multiplying the coefficients between pairs of variables 
gives the following correlation matrix: 

CORRELATION MATRIX 

VARIABLE       Y        X1       X2       X3 

Y            p_1^2     .27      .54      .36 
X1            .27     p_2^2     .32      .22 
X2            .54      .32     p_3^2     .43 
X3            .36      .22      .43     p_4^2 



The common factor variance is: 

$$R^2_{F.1234} = \frac{\sum p_i^2}{M} = \frac{.46 + .16 + .64 + .29}{4} = .39$$ 

The unique factor variance is: 

$$1 - R^2_{F.1234} = \frac{\sum (1 - p_i^2)}{M} = \frac{.54 + .84 + .36 + .71}{4} = .61$$ 
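
The single factor results follow directly from the four loadings. The sketch below is an illustration, with hypothetical variable names, that rebuilds the reproduced correlations as products of loadings and averages the communalities to obtain the common and unique factor variance.

```python
# Factor loadings reported for the single factor model.
loadings = {"Y": 0.677, "X1": 0.402, "X2": 0.800, "X3": 0.535}
names = list(loadings)

# Reproduced correlation between two variables = product of their loadings.
reproduced = {
    (a, b): loadings[a] * loadings[b]
    for i, a in enumerate(names)
    for b in names[i + 1:]
}

# Communality of each variable = squared loading; their sum is the eigenvalue.
communalities = {name: p ** 2 for name, p in loadings.items()}
eigenvalue = sum(communalities.values())

common_variance = eigenvalue / len(loadings)   # proportion due to the factor
unique_variance = 1.0 - common_variance        # proportion left unexplained

for pair, r in reproduced.items():
    print(pair, round(r, 2))
print(round(common_variance, 2), round(unique_variance, 2))
```

Working from unrounded squared loadings gives values that agree with the reported .39 and .61 to two decimals.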
In summary, factor loadings (variable weights) are 
standardized regression coefficients. As such, linear 
weighted combinations of variables loading on a factor are 
used to compute factor scores. The weights are also the 
correlation between the observed (measured) variables and 
the factor (hypothetical construct). If the variable 
correlations (weights) are squared and summed, they describe 
the proportion of variance determined by the common factor. 
This is traditionally known as the coefficient of 
determination, but termed communality in factor analysis. 
When all variables are standardized, then the linear weights 
are called standardized regression coefficients (regression 
analysis), path coefficients (path analysis), or factor 
loadings (factor analysis). The factor analysis approach is 
distinguished from regression or path analysis in that 
observed variable correlation is explained by a common 
factor (hypothetical construct). In factor analysis, 
therefore, the correlation between observed variables is the 
result of sharing a common factor rather than a variable 
being the direct cause (path analysis) or predictor of 
another (regression analysis). 






LISREL 

Linear structural relationships (LISREL) are often 
diagrammed by using multiple factor path models where the 
factors (hypothetical constructs) are viewed as latent traits 
(Joreskog & Sorbom, 1986, pp. I.5-I.7). The LISREL model 
consists of two parts: the measurement model and the 
structural equation model. The measurement model specifies 
how the latent variables or hypothetical constructs are 
measured in terms of the observed (measured) variables and 
describes their measurement properties (reliability and 
validity). The structural equation model specifies the 
causal relationship among the latent variables and is used 
to describe the causal effects and the amount of unexplained 
variance. The LISREL model encompasses a wide 
range of models, for example, univariate or multivariate 
regression models, confirmatory factor analysis, and path 
analysis models (Joreskog & Sorbom, 1986, pp. I.3, I.9-I.12). 
Cuttance (1983) presents an overview of several LISREL 
submodels with diagrams and explanations. Wolfle (1982) 
presents an in-depth presentation of a single model to 
introduce and clarify LISREL analysis. The LISREL program 
therefore permits regression, path, and factor analysis 
whereby model specification and measurement error can be 
assessed. 
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MEASUREMENT ERROR 

Fuller (1987) extensively covers LISREL and factor 
analysis models and especially extends regression analysis 
to the case where the variables are measured with error. 
Wolfe (1979, pp. 48-51) presents the relationship between 
LISREL, regression, and path analysis, especially in regard 
to how measurement error affects the regression coefficient 
(path coefficient). Errors of measurement in statistics have 
been studied extensively (Wolfe, 1979). Cochran (1968) 
studied them from four different aspects: (a) types of 
mathematical models, (b) standard techniques of analysis 
which take into account measurement error, (c) effect of 
errors of measurement in producing bias and reduced 
precision and what remedial procedures are available, and 
(d) techniques for studying error of measurement. Cochran 
(1970) also studied the effects of error of measurement on 
the squared multiple correlation coefficient. 

LISREL-FACTOR ANALYSIS EXAMPLE 

A LISREL factor analysis model program with a 
correlation matrix as input is given in Appendix D. The 
factor analytic model in matrix notation is: 

$$x = \Lambda_x \xi + \delta$$ 

Where: x = observed variables 

       \Lambda_x = structural weights (factor loadings) 

       \xi = latent trait (factor) 

       \delta = error variance (unique variance) 
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The LISREL results are: 



a. \Lambda_x = LAMBDA X (structural weights-factor loadings) 

   Y = .677    X1 = .402    X2 = .800    X3 = .535 

b. \Theta_\delta = THETA DELTA (unique factor variance) 

   Y = .54     X1 = .84     X2 = .36     X3 = .71 

c. p^2 = LAMBDA X^2 (common factor variance) 

   Y = .46     X1 = .16     X2 = .64     X3 = .29 

The concept of model specification and goodness of fit 
pertains to the original correlation matrix and the 
estimated correlation matrix. The estimated correlation 
matrix is: 

$$\Sigma = \begin{bmatrix} .272 & & \\ .542 & .321 & \\ .362 & .215 & .427 \end{bmatrix}$$ 

The original correlation matrix is: 

$$S = \begin{bmatrix} .507 & & \\ .481 & .224 & \\ .276 & .062 & .577 \end{bmatrix}$$ 

The Goodness of Fit Index (GFI) using the unweighted least 
squares approach (ULS) is then computed as: 

$$GFI = 1 - \tfrac{1}{2}\,\mathrm{trace}\,(S - \Sigma)^2$$ 

$$GFI = 1 - \tfrac{1}{2}\,(1.308 - 1.02)^2$$ 

GFI = 1 - .041 
GFI = .959 
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LISREL-REGRESSION ANALYSIS EXAMPLE 

A LISREL regression model program with a correlation 
matrix as input is given in Appendix E. The regression 
model in matrix notation is: 

$$Y = \Gamma X + \zeta$$ 

Where: Y = dependent variable 

       \Gamma = gamma matrix (beta weights) 

       X = independent variables 

       \zeta = errors of prediction (error variance) 

The LISREL results are the same as in the previous 
regression program: 

$$R^2_{y.123} = \gamma_1 r_{y1} + \gamma_2 r_{y2} + \gamma_3 r_{y3}$$ 

$$R^2_{y.123} = (.423)(.507) + (.363)(.481) + (.040)(.276) = .40$$ 
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DISCRIMINANT ANALYSIS 

The general approach in both two-group and multiple-group 
discriminant classification is to construct a linear 
combination of variables which optimally classifies or 
assigns subjects to known groups (Huberty, 1974, 1975). In 
the two-group dependent variable case, where only one 
discriminant function is needed, regression and discriminant 
analysis are the same (Kerlinger & Pedhazur, 1973, 
pp. 336-340; Thayer, 1986). They are compared and presented in 
Appendix F. They differ in the multiple group case, where 
more than one discriminant function is computed. 

The linear combination of weighted variables can be 
expressed as: 

$$L = \beta_1 X_1 + \cdots + \beta_n X_n$$ 

with \beta values chosen to provide maximum discrimination 
between two populations. The \beta's are constructed as linear 
combinations of the differences between variable means in 
the two groups: 

$$d_j = \bar{X}_{1j} - \bar{X}_{2j}$$ 

WHERE: 

VARIABLE     GROUP MEANS       d 
              1       0 

X1           5.2     2.8     2.4 
X2           3.6     2.4     1.2 

(Group means and differences as computed from the ten cases 
listed in Appendix F.) 
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The pooled sample covariance is represented as: 

$$s_{jq} = \frac{\sum_{i=1}^{2} \sum_{k=1}^{n_i} (X_{ijk} - \bar{X}_{ij})(X_{iqk} - \bar{X}_{iq})}{n_1 + n_2 - 2}$$ 

To provide maximum discrimination, the variation of 
values in L between the two groups should be greater than 
the variation in the values of L within the two groups. In 
fact, this is just the case: 

$$SS_B = \sum_{i=1}^{2} n_i\,(\bar{L}_i - \bar{L})^2 \qquad SS_W = \sum_{i=1}^{2} \sum_{k=1}^{n_i} (L_{ik} - \bar{L}_i)^2$$ 

The ratio of these two can be thought of as a measure of the 
discriminatory power of L, in the sense that the larger the 
ratio of the sum of squares between to the sum of squares 
within, the more L reflects between-population variance as 
opposed to within-population variance. 

The multiple regression and discriminant statistics are 
therefore related as: 

$$F = \frac{SS_B / df_1}{SS_W / df_2} = \frac{SS_{reg} / p}{SS_{error} / (n_1 + n_2 - p - 1)} = \frac{1.194 / 2}{1.306 / 7} = \frac{n_1 n_2\,(n_1 + n_2 - p - 1)}{(n_1 + n_2)(n_1 + n_2 - 2)\,p}\; D^2$$ 

$$R^2 = \frac{SS_{reg}}{SS_{total}} = c \cdot D^2 = .1632\,(2.928) = .4778$$ 

(note: SS_{total} = Npq = 2.5) 


The Mahalanobis D-squared and constant value are 
computed as: 



$$D^2 = \sum_{j=1}^{p} \sum_{q=1}^{p} d_j\,d_q\,s^{jq} = \frac{(n_1 + n_2)(n_1 + n_2 - 2)}{n_1 n_2} \cdot \frac{R^2}{1 - R^2} = 2.928$$ 

$$c = \frac{\dfrac{n_1 n_2}{n_1 + n_2}}{(n_1 + n_2 - 2) + \dfrac{n_1 n_2}{n_1 + n_2}\,D^2} = \frac{2.5}{15.319} = .1632$$ 

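Regression weights are compared to discriminant weights 
(\beta = regression weights; b = discriminant weights) as: 

$$b_i = \frac{\beta_i}{c}$$ 

THEN: 

$$b_1 = \frac{.1389}{.1632} = .8511 \qquad b_2 = \frac{.1204}{.1632} = .7375$$ 

AND: 

$$b_0 = -.5\,(L_1 + L_2) = -.5\,(7.081 + 4.153) = -5.617$$ 

$$L_1 = b_1 \bar{X}_{11} + b_2 \bar{X}_{12} = .8511(5.2) + .7375(3.6) = 7.081$$ 

$$L_2 = b_1 \bar{X}_{21} + b_2 \bar{X}_{22} = .8511(2.8) + .7375(2.4) = 4.153$$ 

Here the first subscript on each mean denotes the group and 
the second the variable, so L_1 and L_2 are the discriminant 
function evaluated at the two group centroids. 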
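
The relationship between the discriminant and regression weights can be traced from the raw data. The sketch below is a minimal illustration, not the SPSSX DISCRIMINANT procedure: the helper names are hypothetical, and the ten cases are those listed in Appendix F (read as Y, X1, X2). It forms the group mean differences, the pooled covariance matrix, the discriminant weights, D-squared, and the constant c, and then rescales the discriminant weights to the regression weights.

```python
# Cases from Appendix F as (X1, X2) pairs, split by the dichotomous Y.
group_1 = [(8, 3), (7, 4), (5, 5), (3, 4), (3, 2)]   # Y = 1
group_0 = [(4, 2), (3, 1), (3, 2), (2, 2), (2, 5)]   # Y = 0


def mean_vector(rows):
    """Mean of each of the two variables within one group."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def within_sscp(rows, means):
    """Within-group sums of squares and cross-products (2 x 2)."""
    return [[sum((r[j] - means[j]) * (r[q] - means[q]) for r in rows)
             for q in range(2)] for j in range(2)]


n1, n0 = len(group_1), len(group_0)
m1, m0 = mean_vector(group_1), mean_vector(group_0)
d = [m1[j] - m0[j] for j in range(2)]                 # mean differences

# Pooled within-group covariance matrix.
w1, w0 = within_sscp(group_1, m1), within_sscp(group_0, m0)
s = [[(w1[j][q] + w0[j][q]) / (n1 + n0 - 2) for q in range(2)]
     for j in range(2)]

# Inverse of the 2 x 2 pooled covariance matrix.
det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
s_inv = [[s[1][1] / det, -s[0][1] / det],
         [-s[1][0] / det, s[0][0] / det]]

# Discriminant weights and Mahalanobis D-squared.
b = [sum(s_inv[j][q] * d[q] for q in range(2)) for j in range(2)]
d_squared = sum(d[j] * b[j] for j in range(2))

# Constant linking discriminant and regression weights.
k = n1 * n0 / (n1 + n0)
c = k / ((n1 + n0 - 2) + k * d_squared)
regression_weights = [c * bj for bj in b]
r_squared = c * d_squared

print([round(v, 4) for v in b], round(d_squared, 3), round(c, 4))
print([round(v, 4) for v in regression_weights], round(r_squared, 4))
```

The point of the design is that only the scaling differs: once c is known, either set of weights can be recovered from the other.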

The formulae indicate that the regression procedure can 
be used to produce a linear combination of weights which 
differs only by a constant value c (the choice of coding 
values for the dependent variable will change the value of 
c). The quantity D-squared is called Mahalanobis D-squared, 
and it represents a measure of the distance between two 
means. 

An extension to the multiple group discriminant case 
using eigenvalues also relates the sums of squares approach 
to several multivariate statistics (Marascuilo & Levin, 
1983, Chapters 7 and 8). For the two group case 
(\lambda = .91489): 

a. Roy's criterion 

$$\theta = \frac{\lambda}{1 + \lambda}$$ 

WHERE: 

$$\theta = \frac{SS_B}{SS_T} = R^2 = .4778$$ 

b. Fisher's F 

$$F = \frac{(N - p - 1)\,\theta}{p\,(1 - \theta)} = 3.20$$ 

c. Hotelling's T^2 

$$T^2 = \frac{(N - 2)\,\theta}{1 - \theta} = 7.32$$ 
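
The three statistics can be computed from the single eigenvalue. The sketch below is illustrative only: the variable names are hypothetical, and N = 10 and p = 2 are taken from the Appendix F example (ten cases, two predictors).

```python
eigenvalue = 0.91489      # lambda for the two group case
n_total, p = 10, 2        # cases and predictors in the Appendix F example

# Roy's criterion: proportion of total variation lying between groups.
theta = eigenvalue / (1.0 + eigenvalue)

# Fisher's F and Hotelling's T-squared expressed through theta.
f_ratio = (n_total - p - 1) * theta / (p * (1.0 - theta))
t_squared = (n_total - 2) * theta / (1.0 - theta)

print(round(theta, 4), round(f_ratio, 2), round(t_squared, 2))
```

Since theta / (1 - theta) equals the eigenvalue, F and T-squared are simply rescaled versions of it, which is why all three statistics lead to the same decision in the two group case.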

Cautionary Remark 



Mueller & Cozad (1988) discuss standardization 
procedures used in SPSSX, BMDP, and SAS to determine 
standardized discriminant coefficients. They indicated that 
the within-group variance (SPSSX, BMDP) should be used 
rather than the total variance (SAS) because it removes 
between-group differences from the estimate. Moreover, 
because the standardized weights are computed differently, 
results can be interpreted erroneously (SPSSX and BMDP 
use the diagonal elements of the within-group covariance 
matrix; SAS uses the diagonal elements of the total 
covariance matrix). These major "canned" statistical 
programs have inconsistencies between them and also within 
them. 
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CONCLUSION 

The appropriate statistical method to use is often an 
issue of debate. It sometimes requires more than one 
approach to analyze data. The rationale for choosing 
between the alternative methods of analysis is usually 
guided by: 

a. purpose of the research 

b. research hypothesis or question 

c. mathematical characteristics of the variables 

d. sampling procedures 

e. statistical assumptions 

f. model validity 

The multivariable methods discussed in this paper have 
in common the general linear model and are the same in 
several respects. First, they identify, partition, and 
control variance. Second, they are based on linear 
combinations of variables. And third, the linear weights 
can be computed based on standardized partial regression 
coefficients. 

The multivariable methods, however, have different 
applications. Multiple regression seeks to identify and 
estimate the amount of variance in the dependent variable 
attributed to one or more independent variables 
(prediction). Path analysis seeks to identify relationships 
among a set of variables (explanation). Factor analysis 
seeks to identify subsets of variables from a much larger 
set (common/shared variance). LISREL determines the degree 
of model specification and measurement error. Discriminant 
analysis seeks to identify a linear combination of variables 
which can be used to assign subjects to groups 
(classification). The different methods were derived 
because of the need for prediction, explanation, common 
variance, model and measurement error assessment, and 
classification type applications. 

Multiple Regression techniques are robust except for 
model specification and measurement errors (Bohrnstedt, 
1971). Multiple regression techniques are useful in 
understanding path, factor, LISREL, and discriminant 
applications. LISREL permits regression, path, and factor 
analyses whereby model specification and measurement error 
can be assessed. LISREL also permits univariate or 
multivariate least squares analysis in either single sample 
or multiple sample (across populations) research settings. 
An understanding of multiple regression and general linear 
model techniques can therefore greatly facilitate one's 
understanding of the testing of research questions in 
multivariable situations. 

Multiple linear regression is also related to canonical 
correlation analysis, under which all parametric tests are 
subsumed as special cases (Knapp, 1978; Marascuilo & Levin, 
1983). A recent presentation suggested that multivariate 
analyses are really univariate analyses and further 
illustrates that an understanding of multiple regression 
facilitates an understanding of multivariable methods 
(Newman, 1988). Some authors have presented multivariate 
analysis of variance using multiple regression methods 
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(Woodward & Overall, 1975), while other authors (Huberty & 
Morris, 1987) present an argument for a truly multivariate 
analysis. 

As a final comment, it is well known that the 
correlation matrix has a central role in the analysis of 
multivariable data. In fact, it was used in the numerous 
computer program examples which assumed standardized 
variables. The inverse of the correlation matrix, however, 
also has important interpretations in multiple regression, 
factor, and discriminant analyses (Raveh, 1985). Two main 
roles are: (a) the inverse should be near a diagonal matrix 
as p, the number of variables, increases in order for factor 
analysis to be meaningful; and (b) the estimated 
coefficients in multiple regression and discriminant 
analysis are obtained from the inverse matrix and are thus 
conditioned on the p specific variables. 
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APPENDICES 

The following appendices contain computer program 
examples based upon correlational input (SPSSX User's Guide, 
3rd ed., Chapter 13, 1988) with the exception of 
discriminant analysis. These programs were run on a 
mainframe computer (with some modification they can also run 
on the personal computer version). A PASCAL program was 
written and compiled to generate random variables (Borland, 
1988), and is in APPENDIX G. The random variables were then 
correlated using a SAS PC program (SAS, 1988) in APPENDIX H. 
Although random data were generated, the relationships and 
principles presented in this paper also apply to research 
data. 
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APPENDIX A 



MULTIPLE REGRESSION ANALYSIS PROGRAM 



TITLE REGRESSION WITH CORRELATION MATRIX INPUT 
COMMENT VARIABLE MEANS=0; VARIANCES=1; CONSTANT=0 
MATRIX DATA VARIABLES=Y X1 X2 X3/N=100 
BEGIN DATA 
1.000 
 .507 1.000 
 .481  .224 1.000 
 .276  .062  .577 1.000 
END DATA 
REGRESSION MATRIX=IN(*)/ 
  MISSING=LISTWISE/ 
  VARIABLES=Y X1 X2 X3/ 
  DEPENDENT=Y/ 
  ENTER X1 X2 X3/ 
FINISH 



APPENDIX B 



PATH ANALYSIS PROGRAM 

A. VARIABLE 3 REGRESSED ON VARIABLES 1 AND 2 

TITLE PATH ANALYSIS EXAMPLE WITH CORRELATION MATRIX INPUT 
COMMENT VARIABLE MEANS=0; VARIANCES=1; CONSTANT=0 
MATRIX DATA VARIABLES=Y X1 X2 X3/N=100 
BEGIN DATA 
1.000 
 .507 1.000 
 .481  .224 1.000 
 .276  .062  .577 1.000 
END DATA 
REGRESSION MATRIX=IN(*)/ 
  MISSING=LISTWISE/ 
  VARIABLES=Y X1 X2 X3/ 
  DEPENDENT=X3/ 
  ENTER X1 X2/ 
FINISH 

B. VARIABLE Y REGRESSED ON VARIABLES 1, 2, AND 3 

TITLE PATH ANALYSIS EXAMPLE WITH CORRELATION MATRIX INPUT 
COMMENT VARIABLE MEANS=0; VARIANCES=1; CONSTANT=0 
MATRIX DATA VARIABLES=Y X1 X2 X3/N=100 
BEGIN DATA 
1.000 
 .507 1.000 
 .481  .224 1.000 
 .276  .062  .577 1.000 
END DATA 
REGRESSION MATRIX=IN(*)/ 
  MISSING=LISTWISE/ 
  VARIABLES=Y X1 X2 X3/ 
  DEPENDENT=Y/ 
  ENTER X1 X2 X3/ 
FINISH 
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APPENDIX C 



FACTOR ANALYSIS PROGRAM 



TITLE FACTOR ANALYSIS EXAMPLE WITH CORRELATION MATRIX INPUT 
COMMENT VARIABLE MEANS=0; VARIANCES=1; CONSTANT=0 
MATRIX DATA VARIABLES=Y X1 X2 X3/N=100 
BEGIN DATA 
1.000 
 .507 1.000 
 .481  .224 1.000 
 .276  .062  .577 1.000 
END DATA 
FACTOR VARIABLES=Y X1 X2 X3/ 
  MATRIX=IN(COR=*)/ 
  CRITERIA=FACTORS(1)/ 
  EXTRACTION=ULS/ 
  ROTATION=NOROTATE/ 
  PRINT CORRELATION DET INITIAL EXTRACTION ROTATION/ 
  FORMAT SORT/ 
  PLOT=EIGEN/ 
FINISH 
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APPENDIX D 

LISREL ANALYSIS PROGRAM 

TITLE 'LISREL FACTOR ANALYSIS WITH CORRELATION MATRIX INPUT' 
INPUT PROGRAM 
NUMERIC DUMMY 
END FILE 
END INPUT PROGRAM 
USERPROC NAME=LISREL 
DATA FOR GROUP ONE 
DA NG=1 NI=4 NO=100 
LA 
'Y' 'X1' 'X2' 'X3' 
KM SY 
1.000 
 .507 1.000 
 .481  .224 1.000 
 .276  .062  .577 1.000 
MO NX=4 NK=1 TD=DI,FR PH=ST 
LK 
'FACTOR' 
PA LX 
4*1 
OU ULS SE TV PC RS VA FS SS MI 
END USER 
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APPENDIX E 



LISREL REGRESSION ANALYSIS PROGRAM 



TITLE 'LISREL REGRESSION ANALYSIS WITH CORRELATION MATRIX' 
INPUT PROGRAM 
NUMERIC DUMMY 
END FILE 
END INPUT PROGRAM 
USERPROC NAME=LISREL 
DATA FOR GROUP ONE 
DA NG=1 NI=4 NO=100 
LA 
'Y' 'X1' 'X2' 'X3' 
KM SY 
1.000 
 .507 1.000 
 .481  .224 1.000 
 .276  .062  .577 1.000 
MO NY=1 NX=3 PS=DI 
OU ULS SE TV PC RS VA SS MI TO 
END USER 
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APPENDIX F 



DISCRIMINANT ANALYSIS 

A. DISCRIMINANT ANALYSIS VIA REGRESSION PROGRAM 

TITLE REGRESSION ANALYSIS WITH DICHOTOMOUS DEPENDENT 
DATA LIST RECORDS=1 /1 Y 1 X1 3 X2 5 
BEGIN DATA 
1 8 3 
1 7 4 
1 5 5 
1 3 4 
1 3 2 
0 4 2 
0 3 1 
0 3 2 
0 2 2 
0 2 5 
END DATA 
REGRESSION VARIABLES=Y X1 X2/ 
  DEPENDENT=Y/ 
  ENTER X1 X2/ 
  SAVE PRED (PSCORE)/ 
PRINT /1 Y X1 X2 PSCORE 
EXECUTE 
FINISH 

B. DISCRIMINANT ANALYSIS VIA DISCRIMINANT PROGRAM 

TITLE DISCRIMINANT ANALYSIS WITH DICHOTOMOUS DEPENDENT 
DATA LIST RECORDS=1 /1 Y 1 X1 3 X2 5 
BEGIN DATA 
1 8 3 
1 7 4 
1 5 5 
1 3 4 
1 3 2 
0 4 2 
0 3 1 
0 3 2 
0 2 2 
0 2 5 
END DATA 
DISCRIMINANT GROUPS=Y (0,1)/VARIABLES=X1 X2/ANALYSIS=X1 X2/ 
  METHOD=DIRECT/ SAVE=CLASS=PRDY/ 
  STATISTICS 11 12 13/ 
COMPUTE YHAT = -5.617 + .8510 * X1 + .7375 * X2 
PRINT /1 Y X1 X2 PRDY YHAT 
EXECUTE 
FINISH 
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APPENDIX G 



PASCAL PROGRAM [a] 

program ran; 
const 
  ns  = 100;          { number of cases to generate } 
  rxy = 0.5;          { target correlation between successive variables } 
  { coefficients of the rational approximation used in gauss } 
  a1 = 2.505922; 
  a3 = -15.73223; 
  a5 = 23.54337; 
  b2 = -7.337743; 
  b4 = 14.97266; 
  b6 = -6.016088; 

var 
  a11, a1sq, q, x1, x2, x3, y1, y2, y3, z1, br, r : real; 
  ix   : longint; 
  out1 : text; 
  k    : integer; 

{ gauss returns an approximately normal deviate in pgbr and } 
{ updates the seed pgix }
procedure gauss(var pgbr : real; var pgix : longint); 
var 
  pgr  : real; 
  pgiy : longint; 

  { randu: multiplicative congruential generator; relies on } 
  { 32-bit longint overflow wrapping, so overflow checking } 
  { must be off } 
  procedure randu(prix : longint; var priy : longint; var yfl : real); 
  begin 
    priy := 0; yfl := 0; 
    priy := prix * 65539; 
    if priy < 0 then 
      priy := priy + 2147483647 + 1; 
    yfl := priy; 
    yfl := yfl * 0.4656613e-9; 
  end; 

  { see J. A. Byars & J. Roscoe for algorithm explanation } 

begin 
  pgiy := 0; pgr := 0; 
  randu(pgix, pgiy, pgr); 
  pgix := pgiy; 
  pgr := pgr - 0.5; 
  q := pgr * pgr; 
  pgbr := ((a1 + (a3 + a5 * q) * q) * pgr) / (1 + (b2 + (b4 + b6 * q) * q) * q); 
end; 

{ see T. Knapp & V. Swoyer for algorithm explanation } 

begin 
  ix := 0; 
  assign(out1, 'corr.dat'); 
  rewrite(out1); 
  ix := 16875423; 
  a1sq := rxy * rxy; 
  a11 := sqrt(1 - a1sq); 
  for k := 1 to ns do 
  begin 
    x1 := 0; y1 := 0; z1 := 0; br := 0; 
    gauss(br, ix); 
    x1 := br; 
    br := 0; 
    gauss(br, ix); 
    z1 := br; 
    y1 := (rxy * x1) + (a11 * z1); 
    gauss(br, ix); 
    x2 := br; 
    y2 := (rxy * y1) + (a11 * x2); 
    gauss(br, ix); 
    x3 := br; 
    y3 := (rxy * y2) + (a11 * x3); 
    writeln(out1, y1:10:6, x1:10:6, y2:10:6, y3:10:6); 
  end; 
  close(out1); 
end. 

[a] Special acknowledgement to Miguel Monsivais, a doctoral 
student in educational research, who converted a prior 
FORTRAN program into Pascal code. 
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APPENDIX H 



A. SAS PC CORRELATION PROGRAM 
data a; 
  infile 'c:corr.dat'; 
  input y x1 x2 x3 @@; 
proc corr; var y x1 x2 x3; 
run; 
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